Abstract

This paper advocates the exploration of the full state of
recorded real-time strategy (RTS) games, by human or
robotic players, to discover how to reason about tactics and
strategy. We present a dataset of StarCraft games
encompassing most of the games' state (not only players'
orders). We illustrate one possible use of this dataset
by clustering armies on their compositions. This reduction of
army compositions to mixtures of Gaussians allows for
strategic reasoning at the level of the components. We
evaluated this clustering method by predicting the outcomes
of battles based on the mixture components of the army
compositions.



Introduction 

Real-time strategy (RTS) game AI is not yet at a level high
enough to compete with trained, skilled human players. In
particular, adaptation to different strategies (including
army composition) and to tactics (army moves) is a strong
indicator of human-played games (Hagelback and Johansson
2010). So, while micro-management (low-level unit control)
has seen tremendous improvements in recent years, broad
high-level strategic reasoning is not yet an exemplary
feature of either commercial games or StarCraft AI
competition entries. At best, StarCraft bots have
an estimation of the available technology of their opponents
and use rules encoding players' knowledge to adapt their
strategy. We believe that better strategic reasoning is a
matter of abstracting and combining the low-level states at
an expressive higher level of reasoning. Our approach is
to learn unsupervised representations of low-level features.

We worked on StarCraft: Brood War, which is a canonical
RTS game, as Chess is to board games. It has been
around since 1998, has sold 10 million licenses and was
the best competitive RTS for more than a decade. There are
3 factions (Protoss, Terran and Zerg) that are totally
different in terms of units, build trees / tech trees
(directed acyclic graphs of the buildings and technologies)
and thus gameplay styles. StarCraft and most RTS games
provide a tool to
record game logs into replays that can be re-simulated by



Copyright © 2012, Association for the Advancement of Artificial 
Intelligence (www.aaai.org). All rights reserved. 

'StarCraft and its expansion StarCraft: Brood War are trade- 
marks of Blizzard Entertainment™ 



the game engine. It is this replay mechanism that we used
to download and simulate games of professional gamers and
highly skilled international competitors.

This paper is divided into two parts. The first part explains
what is in the dataset of StarCraft games that we put
together. The second part showcases the reduction of army
compositions to a mixture of Gaussian distributions, and
gives an evaluation of this clustering.

Related Work 

There are several ways to produce strategic abstractions: 
from using high-level gamers' vocabulary, and the game 
rules (build/tech trees), to salient low-level (shallow) fea- 
tures. Other ways include combining low-level and higher- 
level strategic representation and/or interdependencies be- 
tween states and sequences. 

Case-based reasoning (CBR) approaches often use
extensions of build trees as state lattices (and sets of tactics
for each state) as in (Aha, Molineaux, and Ponsen 2005;
Ponsen and Spronck 2004) in Wargus. Ontañón et al. (2007)
base their real-time case-based planning (CBP) system on
a plan dependency graph which is learned from human
demonstration in Wargus. In (Mishra, Ontañón, and Ram
2008), they use "situation assessment for plan retrieval"
from annotated replays, which recognizes distance to
behaviors (a goal and a plan), and select only the low-level
features with the highest information gain. Hsieh and Sun
(2008) based their work on (Aha, Molineaux, and Ponsen 2005)
and used StarCraft replays to construct states and building
sequences. Strategies are choices of building construction
order in their model.

Schadd, Bakkes, and Spronck (2007) describe opponent
modeling through hierarchically structured models of the
opponent's behavior, and they applied their work to the Spring
RTS game (a Total Annihilation open source clone). Balla
and Fern (2009) applied upper confidence bounds on trees
(UCT: a Monte-Carlo planning algorithm) to tactical assault
planning in Wargus; their tactical abstraction combines units'
hit points and locations. In (Synnaeve and Bessiere 2011b),
they predict the build trees of the opponent a few
buildings before they are built. Another approach is to use
the gamers' vocabulary of strategies (and openings) to abstract
even more what strategies represent (a set of states, of
sequences and of intentions) as in (Weber and Mateas 2009;
Synnaeve and Bessiere 2011a). Dereszynski et al. (2011)
used a hidden Markov model (HMM) whose states are
extracted from (unsupervised) maximum likelihood on a
StarCraft dataset. The HMM parameters are learned from unit
counts (both buildings and military units) every 30
seconds and "strategies" are the most frequent sequences of
the HMM states according to observations.

Few models have incorporated army compositions in their 
strategy abstractions, except sparsely as an aggregate or 
boolean existence of unit types. Most strategy abstractions 
are based on build trees (or tech trees), although a given 
set of buildings can produce different armies. What we will 
present here is complementary to these strategic abstractions 
and should help the military situation assessment. 

Dataset 

We downloaded more than 8000 replays to keep 7649
uncorrupted, 1 vs. 1 replays from professional gamers leagues
and international tournaments of StarCraft, from specialized
websites. We then ran them using the Brood War API (BWAPI)
and dumped: units' positions, regions' positions, pathfinding
distance between regions, resources (every 25 frames),
all players' orders, vision events (when units are seen) and
attacks (types, positions, outcomes). Basically, we recorded
every BWAPI event, plus interesting states and attacks. The
dataset is freely available; the source code and
documentation are also provided.

Regions 



Forbus, Mahoney, and Dill (2002) have shown the importance
of qualitative spatial reasoning, and it would be too
space-consuming to dump the ground distance of every
position to any other position. For these reasons, we
discretized StarCraft maps in two types of regions:

• Brood War Terrain Analyzer produced regions from a
pruned Voronoi diagram on walkable terrain (Perkins
2010). Chokes are the boundaries of such regions.

• As battles often happen at chokes, we also produced
choke-dependent regions (CDR), which are created by
doing an additional (distance-limited) Voronoi tessellation
spawned at chokes. This set of regions is

$CDR = (regions \setminus chokes) \cup chokes$

Attacks 

We trigger an attack tracking heuristic when one unit dies
and there are at least two military units around. We then
update this attack until it ends, recording every unit which
took part in the fight. We log the position, participating
units and fallen units for each player, the attack type and
of course the attacker and the defender. Algorithm 1 shows
how we detect attacks.

We annotated attacks by four types (but researchers can 
also produce their own annotations given the state available): 

• ground attacks, which may use all types of units (and so 
form the large majority of attacks). 

• air raids, air attacks, which can use only flying units. 

• invisible (ground) attacks, which can use only a few spe- 
cific units in each race (Protoss Dark Templars, Terran 
Ghosts, Zerg Lurkers). 

• drop attacks, which need a transport unit (Protoss Shuttle, 
Terran Dropship, Zerg Overlord with upgrade). 



Algorithm 1 Simplified attack tracking heuristic for
extraction from games. The heuristics to determine the attack
type and the attack radius and position are not described here.
They look at the proportions of unit types, which units are
firing and the last actions of the players.

list tracked_attacks

function UNIT_DEATH_EVENT(unit)
    tmp ← tracked_attacks.which_contains(unit)
    if tmp ≠ ∅ then
        tmp.update(unit)    ▷ ⇔ update(tmp, unit)
    else
        tracked_attacks.push(attack(unit))
    end if
end function

function ATTACK(unit)    ▷ new attack constructor
    ▷ self ⇔ this
    self.convex_hull ← default_hull(unit)
    self.type ← attack_type(update(self, unit))
    return self
end function

function UPDATE(attack, unit)
    attack.update_hull(unit)    ▷ takes units ranges into account
    c ← get_context(attack.convex_hull)
    attack.units_involved.update(c)
    attack.tick ← default_timeout()
    return c
end function

function TICK_UPDATE(self)
    self.tick ← self.tick − 1
    if self.tick < 0 then
        self.destruct()
    end if
end function
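The tracking loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the extraction program itself: the fixed `ATTACK_RADIUS` stands in for the convex hull, `DEFAULT_TIMEOUT` is an arbitrary value, and the class and method names are hypothetical. Attack type detection is left out, as in the algorithm.

```python
from dataclasses import dataclass, field

DEFAULT_TIMEOUT = 24 * 10  # assumption: about 10 seconds at 24 frames per second
ATTACK_RADIUS = 300.0      # assumption: a fixed radius replaces the convex hull


@dataclass
class Attack:
    """One tracked attack: a position, the units seen in it, and a countdown."""
    x: float
    y: float
    units_involved: set = field(default_factory=set)
    fallen: list = field(default_factory=list)
    tick: int = DEFAULT_TIMEOUT

    def contains(self, x, y):
        return (x - self.x) ** 2 + (y - self.y) ** 2 <= ATTACK_RADIUS ** 2

    def update(self, unit_id, nearby_ids):
        """Record a death, absorb the nearby units, and reset the timeout."""
        self.fallen.append(unit_id)
        self.units_involved.update(nearby_ids)
        self.units_involved.add(unit_id)
        self.tick = DEFAULT_TIMEOUT


class AttackTracker:
    def __init__(self):
        self.tracked = []   # attacks still in progress
        self.finished = []  # attacks whose timeout has expired

    def on_unit_death(self, unit_id, x, y, nearby_ids):
        """Attach the death to an ongoing attack, or open a new one."""
        for attack in self.tracked:
            if attack.contains(x, y):
                attack.update(unit_id, nearby_ids)
                return attack
        if len(nearby_ids) < 2:  # the text asks for at least two military units around
            return None
        attack = Attack(x, y)
        attack.update(unit_id, nearby_ids)
        self.tracked.append(attack)
        return attack

    def on_frame(self):
        """Count every attack down and retire those that have gone quiet."""
        for attack in self.tracked:
            attack.tick -= 1
        self.finished += [a for a in self.tracked if a.tick < 0]
        self.tracked = [a for a in self.tracked if a.tick >= 0]


tracker = AttackTracker()
tracker.on_unit_death("zealot_1", 100.0, 100.0, {"zealot_2", "dragoon_1"})
tracker.on_unit_death("zealot_2", 120.0, 90.0, {"dragoon_1", "dragoon_2"})
for _ in range(DEFAULT_TIMEOUT + 1):
    tracker.on_frame()
```

Both deaths fall within the same radius, so they are merged into one attack, which is retired once no further death has reset its countdown.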



Information in the dataset 

Table 1 shows some metrics about the dataset. Note that
the numbers of attacks for a given race have to be divided
by (approximately) two in a given non-mirror match-up.
So, there are 7072 Protoss attacks in PvP and there are not
70,089 attacks by Protoss in PvT but about half that.



match-up                                PvP     PvT     PvZ     TvT     TvZ     ZvZ
number of games                         445    2408    2027     461    2107     199
number of attacks                      7072   70089   40121   16446   42175    2162
mean attacks / game                   15.89   29.11   19.79   35.67   20.02   10.86
mean time (frames) / game             32342   37772   39137   37717   35740   23898
mean time (minutes) / game            22.46   26.23   27.18   26.19   24.82   16.60
actions issued (game engine) / game   [...]   [...]   [...]   [...]   [...]   [...]
mean regions / game                   19.59   19.88   19.69   19.83   20.21   19.31
mean CDR / game                       41.58   41.61   41.57   41.44   42.10   40.70
mean ground distance region ↔ region   2569    2608    2607    2629    2604    2596
mean ground distance CDR ↔ CDR         2397    2405    2411    2443    2396    2401

Table 1: Detailed numbers about our dataset. XvY means race X vs race Y matches and is an abbreviation of the match-up: PvP
stands for Protoss versus Protoss.
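As a quick consistency check, the derived rows of Table 1 can be recomputed from the raw rows. The short sketch below uses only values from the table and the 24 frames per second rate given in the text; the dictionary and function names are illustrative.

```python
FRAMES_PER_SECOND = 24  # game speed stated in the text

MATCHUPS = ["PvP", "PvT", "PvZ", "TvT", "TvZ", "ZvZ"]
games = dict(zip(MATCHUPS, [445, 2408, 2027, 461, 2107, 199]))
attacks = dict(zip(MATCHUPS, [7072, 70089, 40121, 16446, 42175, 2162]))
mean_frames = dict(zip(MATCHUPS, [32342, 37772, 39137, 37717, 35740, 23898]))


def attacks_per_game(matchup):
    """Mean number of detected attacks per game for one match-up."""
    return attacks[matchup] / games[matchup]


def mean_minutes(matchup):
    """Mean game length in minutes, converted from game frames."""
    return mean_frames[matchup] / FRAMES_PER_SECOND / 60


for m in MATCHUPS:
    print(f"{m}: {attacks_per_game(m):.2f} attacks/game, {mean_minutes(m):.2f} min/game")
```

The recomputed values agree with the "mean attacks / game" and "mean time (minutes) / game" rows to two decimals.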



By running the recorded games (replays) through StarCraft,
we were able to recreate the full state of the game.
Time is always expressed in game frames (24 frames per
second). We recorded three types of files:

• general data (*.rgd files): records the players' names,
the map's name, and all information about events like
creation (along with morph), destruction, discovery (for
one player), change of ownership (special spell/ability),
for each unit. It also shows attack events (detected by a
heuristic, see Algorithm 1) and dumps the current economic
situation every 25 frames: minerals, gas, supply (count and
total: max supply).

• order data (*.rod files): records all the orders which are
given to the units (individually) like move, harvest, attack
unit, the orders' positions and their issue time.

• location data (*.rld files): records positions of mobile
units every 100 frames, and their position in regions and
choke-dependent regions if they changed since the last
measurement. It also stores ground distance
(pathfinding-wise) matrices between regions and
choke-dependent regions in the header.

From this data, one can recreate most of the state of the 
game: the map key characteristics (or load the map sepa- 
rately), the economy of all players, their tech (all researches 
and upgrades), all the buildings and units, along with their 
orders and their positions. 

Armies composition 

We will consider units engaged in these attacks as armies 
and will seek a compact description of armies compositions. 

Armies clustering 

The idea behind army clustering is to give one
"composition" label to each army depending on its ratios
of the different unit types. Giving a "hard" (unique) label to
each army does not work well because armies contain
different components of unit type combinations. For instance,
a Protoss army can have only a "Zealots+Dragoons"
component, but this will often be just one of the components
(sometimes the backbone) of the army composition, augmented
for instance with "High Templars+Archons".



Because hard clustering is not an optimal solution,
we used a Gaussian mixture model (GMM), which
assumes that an army is a mixture (i.e. weighted sum) of
several (Gaussian) components. We present the model in
the Bayesian programming framework (Diard, Bessiere, and
Mazer 2003): we first describe the variables, the
decomposition (independence assumptions) and the forms of the
distributions. Then, we explain how we identified (learned)
the parameters and lay out the question that we will ask this
model in the following parts.

Variables 

• $C \in \{c_1, \ldots, c_K\}$, our army clusters/components (C).
There are K unit clusters and K depends on the race
(the mixture components are not the same for
Protoss/Terran/Zerg).

• $U \in ([0, 1] \ldots [0, 1])$ (length N), our N-dimensional unit
types (U) proportions, i.e. $U \in [0, 1]^N$. N is dependent on
the race and is the total number of unit types. For instance,
an army with equal numbers of Zealots and Dragoons
(and nothing else) is represented as
$\{U_{Zealot} = 0.5, U_{Dragoon} = 0.5, \forall ut \neq Zealot|Dragoon\ U_{ut} = 0.0\}$,
i.e. $U = (0.5, 0.5, 0, \ldots, 0)$ if Zealots and
Dragoons are the first two components of the U vector.
So $\sum_i U_i = 1$ whatever the composition of the army.
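The mapping from raw unit counts to the proportion vector U can be sketched as below. The `UNIT_TYPES` ordering is a hypothetical, shortened list used only for illustration; in the model, N covers every unit type of the race.

```python
# Assumption: a shortened, arbitrarily ordered list of Protoss unit types.
UNIT_TYPES = ["Zealot", "Dragoon", "High Templar", "Archon", "Reaver", "Shuttle"]


def composition(unit_counts, unit_types=UNIT_TYPES):
    """Turn {unit type: count} into a vector of proportions that sums to 1."""
    total = sum(unit_counts.get(t, 0) for t in unit_types)
    if total == 0:
        raise ValueError("cannot normalize an empty army")
    return [unit_counts.get(t, 0) / total for t in unit_types]


u = composition({"Zealot": 6, "Dragoon": 6})
print(u)  # [0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
```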

Decomposition: For the M battles, the army compositions
are independent across battles, and the unit types
proportions vector (army composition) is generated by a
mixture of Gaussian components and thus $U_i$ depends on $C_i$.

$$P(U_{1 \ldots M}, C_{1 \ldots M}) = \prod_{i=1}^{M} P(U_i \mid C_i)\, P(C_i)$$

Forms

• $P(U_i \mid C_i)$ mixture of Gaussian distributions:

$$P(U_i \mid C_i = c) = \mathcal{N}(\mu_c, \sigma_c^2)$$

• $P(C_i) = Categorical(K, p_C)$:

$$P(C_i = c_k) = p_k$$



Identification (learning): We learned the Gaussian mixture
model (GMM) parameters with the expectation-maximization
(EM) algorithm with 5 to 15 mixture components and with
spherical, tied, diagonal and full covariance matrices, using
scikit-learn (Pedregosa et al. 2011). We kept the best
scoring models (by varying the number of mixtures) according
to the Bayesian information criterion (BIC) (Schwarz 1978).

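Let $\theta = (\mu_{1:K}, \sigma_{1:K})$, being respectively the K different
N-dimensional means ($\mu_{1:K}$) and the variances ($\sigma_{1:K}^2$) of
the normal distributions. Initialize $\theta$ randomly, and let

$$L(\theta; u) = P(u \mid \theta) = \prod_{i=1}^{M} \sum_{k=1}^{K} P(u_i \mid \theta, C_i = c_k)\, P(C_i = c_k)$$

Iterate until convergence (of $\theta$):

1. E-step: $Q(\theta \mid \theta^{(t)}) = E[\log L(\theta; u, C)]$

$$= E\left[\log \prod_{i=1}^{M} \sum_{k=1}^{K} P(u_i \mid C_i = c_k, \theta)\, P(C_i = c_k)\right]$$

2. M-step: $\theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)})$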
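The identification step can be sketched with the current scikit-learn API (`GaussianMixture`, which runs EM internally). This is an illustration under stated assumptions, not the original experiment code: the data are synthetic Dirichlet samples standing in for army compositions, and dropping the last (redundant) proportion to keep full covariances well conditioned is a choice of this sketch, not something stated in the text.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for army compositions: each row sums to 1.
compositions = rng.dirichlet(alpha=[2.0, 2.0, 0.5, 0.5], size=400)
# Assumption of this sketch: drop the last coordinate, which is determined
# by the others, so that full covariance matrices are not singular.
X = compositions[:, :-1]


def select_gmm(X, n_range=range(5, 16),
               cov_types=("spherical", "tied", "diag", "full")):
    """Fit one GMM per (covariance type, number of components), keep the lowest BIC."""
    best, best_bic = None, np.inf
    for cov in cov_types:
        for k in n_range:
            gmm = GaussianMixture(n_components=k, covariance_type=cov,
                                  random_state=0).fit(X)
            bic = gmm.bic(X)  # in scikit-learn, lower BIC is better
            if bic < best_bic:
                best, best_bic = gmm, bic
    return best, best_bic


best, best_bic = select_gmm(X)
posteriors = best.predict_proba(X[:3])  # soft cluster memberships of three armies
print(best.n_components, best.covariance_type, posteriors.shape)
```

Each row of `posteriors` is the soft assignment of one army to the mixture components, which is what replaces a hard cluster label.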



Question: For the ith battle (one army with units u):

$$P(C_i \mid U_i = u) \propto P(C_i)\, P(U_i = u \mid C_i)$$
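A minimal sketch of this question, assuming spherical Gaussian components for brevity (the text also considers tied, diagonal and full covariances). The two-component model and all its parameter values are hypothetical.

```python
import math


def spherical_gaussian_logpdf(u, mean, variance):
    """Log density of an isotropic Gaussian N(mean, variance * I) at u."""
    n = len(u)
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, mean))
    return -0.5 * (n * math.log(2 * math.pi * variance) + sq_dist / variance)


def posterior(u, priors, means, variances):
    """P(C | U = u), proportional to P(C) P(U = u | C), normalized over the components."""
    logs = [math.log(p) + spherical_gaussian_logpdf(u, m, v)
            for p, m, v in zip(priors, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical model over (Zealot, Dragoon, Reaver) proportions.
priors = [0.5, 0.5]
means = [[0.5, 0.5, 0.0], [0.0, 0.7, 0.3]]
variances = [0.02, 0.02]

p_typical = posterior([0.5, 0.5, 0.0], priors, means, variances)
p_between = posterior([0.25, 0.6, 0.15], priors, means, variances)
```

An army sitting on the first mean is assigned almost entirely to the first component, while an army halfway between the two means is split evenly.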

Counter compositions 

In a battle, there are two armies (one for each player); we
can thus apply this clustering to both armies. If we have
K clusters and N unit types, the opponent has K' clusters
and N' unit types. We introduce EU and EC, respectively
with the same semantics as U and C but for the enemy. In
a given battle, we observe u and eu, respectively our army
composition and the enemy's army composition. We can ask
$P(C \mid U = u)$ and $P(EC \mid EU = eu)$.

As StarCraft unit types have strengths and weaknesses
against other types, we can learn which clusters should beat
other clusters (at equivalent investment) as a probability
table. We use Laplace's law of succession ("add-one
smoothing") by counting and weighting according to battle
results ($c > ec$ means "c beats ec", i.e. we won against the
enemy):

$$P(C = c \mid EC = ec) = \frac{1 + P(c)\, P(ec)\, count_{battles}(c > ec)}{K + P(ec)\, count_{battles\ with}(ec)}$$
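One possible reading of this formula, sketched below, is that each battle contributes to the counts in proportion to the cluster posteriors of both armies. That weighting scheme is an assumption of this sketch, and the function name and the battle tuples are hypothetical.

```python
def counter_table(battles, K, K_enemy):
    """Estimate table[c][ec], a smoothed P(C = c | EC = ec), from battle results.

    battles: iterable of (p_c, p_ec, won), where p_c and p_ec are the cluster
    posteriors of our army and of the enemy army, and won is True if we won.
    """
    wins = [[0.0] * K_enemy for _ in range(K)]  # weighted count of "c beats ec"
    seen = [0.0] * K_enemy                      # weighted count of battles with ec
    for p_c, p_ec, won in battles:
        for j in range(K_enemy):
            seen[j] += p_ec[j]
            if won:
                for i in range(K):
                    wins[i][j] += p_c[i] * p_ec[j]
    # Add-one smoothing: 1 in the numerator, K in the denominator.
    return [[(1.0 + wins[i][j]) / (K + seen[j]) for j in range(K_enemy)]
            for i in range(K)]


# Three battles with one-hot posteriors, all against enemy cluster 1.
battles = [
    ([1.0, 0.0], [0.0, 1.0], True),
    ([1.0, 0.0], [0.0, 1.0], True),
    ([0.0, 1.0], [0.0, 1.0], False),
]
table = counter_table(battles, K=2, K_enemy=2)
```

With these battles, `table[0][1]` is (1 + 2) / (2 + 3) = 0.6 and `table[1][1]` is (1 + 0) / (2 + 3) = 0.2, while the column of the never-observed enemy cluster 0 stays at the uniform prior 1/K.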

Results 

We used the dataset presented in this paper to learn all the
parameters and perform the benchmarks (by setting 100 test
matches aside and learning on the remainder of the dataset).
First, we analyze the posteriors of clustering only one army,
and then we evaluate the clustering as a means to predict
outcomes of battles.

Posterior analysis: Figure 1 shows a parallel plot of army
compositions. We removed the less frequent unit types to
keep only the 8 most important unit types of the PvP
match-up, and we display an 8-dimensional representation of
the army composition, where each vertical axis represents one
dimension. Each line (trajectory in this 8-dimensional space)
represents an army composition (engaged in a battle) and gives
the percentage of each of the unit types. These lines (armies)
are colored with their most probable mixture component,
which is shown on the rightmost axis. We have 8 clusters
(Gaussian mixture components): this is not related to the
8 unit types used, as the number of mixtures was chosen by
BIC score. Expert StarCraft players will directly recognize
the clusters of typical armies; here are some of them:

• Light blue corresponds to the "Reaver Drop" tactical
squads, whose aim is to transport (with the flying Shuttle)
the slow Reaver (zone damage artillery) inside the
opponent's base to cause massive economic damage.

• Red corresponds to the "Nony" typical army that is played
in PvP (lots of Dragoons, supported by Reaver and
Shuttle).

• Green corresponds to a High Templar and Archon-heavy
army: the gas invested in such high-tech units means
that there are fewer Dragoons, compensated by more Zealots
(which cost no gas).

• Purple corresponds to Dark Templar ("sneaky", as Dark
Templars are invisible) special tactics (and opening).

Figure 2 showcases the dynamics of cluster components:
$P(EC^t \mid EC^{t+1})$, for Zerg (vs Protoss) for $\Delta t$ of 2 minutes.
The diagonal components correspond to those which do not
change between t and t + 1 (⇔ t + 2 minutes), and so it is
normal that they are very high. The other components show
the shift between clusters. For instance, the square in the
first line, seventh column (at (0,6)) shows an abrupt
transition from the first component (0) to the seventh (6).
This may be the production of Mutalisks from a previously
very low-tech army (Zerglings).
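A table of this kind could be estimated by counting, for each army, the most probable cluster at consecutive time steps. The sketch below is an assumption about how such a table may be built, not a description of the actual procedure: it takes rows as the cluster at t and columns as the cluster at t + 1, and normalizes each column so that it reads as a distribution over the cluster at t given the cluster at t + 1.

```python
def transition_table(sequences, K):
    """Estimate table[a][b] = P(EC^t = a | EC^{t+1} = b) from label sequences.

    sequences: one list of cluster indices per game, sampled every delta-t.
    """
    counts = [[0] * K for _ in range(K)]  # counts[a][b]: a at t, then b at t+1
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    table = [[0.0] * K for _ in range(K)]
    for b in range(K):
        column_total = sum(counts[a][b] for a in range(K))
        for a in range(K):
            # Fall back to a uniform column when cluster b was never reached.
            table[a][b] = counts[a][b] / column_total if column_total else 1.0 / K
    return table


# Two hypothetical games with two clusters.
table = transition_table([[0, 0, 1], [0, 1, 1]], K=2)
```

Here cluster 1 is reached three times, twice from cluster 0 and once from itself, so the second column is (2/3, 1/3).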



Figure 2: Dynamics of clusters: $P(EC^t \mid EC^{t+1})$ for Zerg,
with $\Delta t$ = 2 minutes

A soft rock-paper-scissors: We then used the learned
$P(C \mid EC)$ table to estimate the outcome of the battle. For

Mutalisks are flying units which require unlocking several
technologies, and thus players save up for their production
while opening their tech tree.






Figure 1: Parallel plot of a small dataset of Protoss (vs Protoss, i.e. in the PvP match-up) army clusters on most important unit 
types (for the match-up). Each normalized vertical axis represents the percentage of the units of the given unit type in the army 
composition (we didn't remove outliers, so most top vertices (tip) represent 100%), except for the rightmost (framed) which 
links to the most probable GMM component. Note that several traces can (and do) go through the same edge. 



that, we used battles with limited disparities (the maximum
strength ratio of one army over the other) of 1.1 to 1.5. Note
that the army with superior numbers has a more than linear
advantage over its opponent (because of focus firing), so a
disparity of 1.5 is very high. For information, there is an
average of 5 battles per game at a 1.3 disparity threshold,
and the number of battles (used) per game increases with the
disparity threshold.

We also made up a baseline heuristic, which uses the sum
of the values of the units to decide which side should win. If
we note v(unit) the value of a unit, the heuristic computes
$\sum_{unit} v(unit)$ for each army and predicts that the winner
is the one with the biggest score. For the value of a unit we
used:

$$v(unit) = minerals\_value + \frac{4}{3}\, gas\_value + 50\, supply$$

Of course, we recall that a random predictor would predict
the result of the battle correctly 50% of the time.
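The baseline heuristic can be sketched as follows. The gas weight of 4/3 and the supply weight of 50 are taken as assumptions from the formula above; the costs are the standard Brood War costs of two Protoss units, and counting supply in in-game units (2 per Zealot) rather than in BWAPI's doubled units is a further assumption of this sketch.

```python
GAS_WEIGHT = 4 / 3   # assumption: gas weight read from the value formula
SUPPLY_WEIGHT = 50   # supply weight from the value formula

# (minerals, gas, supply) per unit; supply is in in-game units (assumption).
COSTS = {
    "Zealot": (100, 0, 2),
    "Dragoon": (125, 50, 2),
}


def unit_value(name):
    """v(unit) = minerals + GAS_WEIGHT * gas + SUPPLY_WEIGHT * supply."""
    minerals, gas, supply = COSTS[name]
    return minerals + GAS_WEIGHT * gas + SUPPLY_WEIGHT * supply


def army_value(army):
    """Sum of unit values for an army given as {unit type: count}."""
    return sum(unit_value(name) * count for name, count in army.items())


def predict_winner(army_a, army_b):
    """Return 'a' if army_a has the strictly larger total value, else 'b'."""
    return "a" if army_value(army_a) > army_value(army_b) else "b"


zealots = {"Zealot": 6}
dragoons = {"Dragoon": 4}
winner = predict_winner(zealots, dragoons)
```

Six Zealots are worth 1200 and four Dragoons about 1166.7 under this valuation, so the heuristic favors the Zealot army regardless of how the two unit types actually match up, which is the limitation the clustering is meant to address.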

A summary of the main metrics is shown in Table 2; the
first line can be read as: for a forces disparity of 1.1, for
Protoss vs Protoss (first column),

• considering only military units

- the heuristic predicts the outcome of the battle correctly
63% of the time.

- the probability of one cluster mixture winning against
another ($P(C \mid EC)$), without taking the forces sizes into
account, predicts the outcome correctly 54% of the
time.

- the probability of one cluster mixture winning against
another, also taking the forces sizes into account
($P(C \mid EC) \times \sum_{unit} v(unit)$), predicts the outcome
correctly 61% of the time.



12 Efficiently micro-managed, an army 1.5 times superior to their 
opponents can keep much more than one third of the units alive. 



• considering all units involved in the battle (military
units, plus static defenses and workers): same as above.

Results are given for all match-ups (columns) and different
forces disparities (lines). The last column sums up the means
on all match-ups, with the whole army (military units plus
static defenses and workers involved), for the three metrics.
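The "just prob." and "prob x heuristic" predictors can be sketched as below. The exact decision rule is not spelled out in the text, so this is only one plausible reading: each side's score is the expected probability of beating the other under both cluster posteriors, optionally multiplied by the army value. The two tables and all numbers are hypothetical.

```python
def win_probability(p_own, p_other, table):
    """Expected P(own cluster beats other cluster) under both posteriors.

    table[i][j] is the learned probability that own cluster i beats other cluster j.
    """
    return sum(p_own[i] * p_other[j] * table[i][j]
               for i in range(len(p_own)) for j in range(len(p_other)))


def predict(p_c, p_ec, table_ours, table_theirs, value_ours=1.0, value_theirs=1.0):
    """With unit values left at 1.0 this is 'just prob.'; with army values
    passed in, it is one reading of 'prob x heuristic'."""
    ours = win_probability(p_c, p_ec, table_ours) * value_ours
    theirs = win_probability(p_ec, p_c, table_theirs) * value_theirs
    return "ours" if ours > theirs else "theirs"


# Hypothetical tables: table_ours[c][ec] and table_theirs[ec][c].
table_ours = [[0.5, 0.6], [0.5, 0.2]]
table_theirs = [[0.5, 0.5], [0.3, 0.7]]
p_c, p_ec = [1.0, 0.0], [0.0, 1.0]

just_prob = predict(p_c, p_ec, table_ours, table_theirs)
with_values = predict(p_c, p_ec, table_ours, table_theirs,
                      value_ours=1000.0, value_theirs=2500.0)
```

In this toy case the composition alone favors our army (0.6 against 0.3), but a 2.5 times larger enemy army reverses the prediction once values are included.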

Also, without explicitly labeling clusters, one can apply
thresholding to special units (Observers, Arbiters,
Defilers...) to generate more specific clusters: we did not put
these results here (they include too much expertise/tuning)
but they sometimes drastically increase prediction scores, as
one Observer can change the course of a battle.

We can see that predicting battle outcomes (even with a
high disparity) with "just probabilities" of $P(C \mid EC)$
(without taking the forces into account) gives relevant
results, as they are always above random predictions. Note
that this is a very high-level (abstract) view of a battle: we
do not consider tactical positions, nor players' attention,
actions, etc. Also, it is better (on average) to consider the
heuristic with the composition of the army ("prob x
heuristic") than to consider the heuristic alone, even for
high forces disparity. Our heuristic augmented with the
clustering seems to be the best indicator for battle situation
assessment. These prediction results with "just prob.", and
the fact that the heuristic with $P(C \mid EC)$ tops the
heuristic alone, are evidence that representing army
compositions as Gaussian mixtures of clusters works.

Secondly, and perhaps more importantly, we can view the
difference between "just prob." results and random guessing
(50%) as the military efficiency improvement that we can
(at least) expect from having the right army composition.
Indeed, for small forces disparities (up to 1.1 for instance),
the prediction based only on army composition ("just prob.":
63.2%) is better than the prediction with the baseline
heuristic (61.7%). It means that we can expect to win 63.2% of
the time (instead of 50%) with an (almost) equal investment if
we have the right composition. Also, when we predict 58.5%
PvP  PvT  PvZ  TvT  TvZ  ZvZ  mean
disparity    in %    m  ws  m  [...]  m  ws  [...]  ws

[...] heuristic    63  63  58  [...]  65  [...]  61.7
1.1  [...]  62  [...]  73  73  66  66  69  69  75  72  [...]  72  [...]  70
[...] just prob.    56  57  65  [...]  57  [...]  61  59.5  [...]  75  73  73  [...]  76  75  75  75.7
[...] just prob.    52  [...]  56  61  [...]  56  [...]  58.2
prob x heuristic    75  76  74  75  72  72  [...]  76  77  80  76.2



Table 2: Winner prediction scores (in %) for the three main metrics. For the left columns ("m"), we considered only military
units. For the right columns ("ws") we also considered static defense and workers. The "heuristic" metric is a baseline heuristic
for battle winner prediction for comparison using army values, while "just prob." only considers $P(C \mid EC)$ to predict the winner,
and "prob x heuristic" balances the heuristic's predictions with $\sum_{C,EC} P(C \mid EC)\, P(EC)$.



of the time the accurate result of a battle with disparity up 
to 1.5 from "just prob.", this success in prediction is inde- 
pendent of the sizes of the armies. What we predicted is that 
the player with the better army composition won (and not 
necessarily the one with more or more expensive units). 

Conclusion 

We delivered a rich StarCraft dataset which enables the
study of tactical and strategic elements of RTS gameplay.
Our (successful) previous works on this dataset include
learning a tactical model of where and how to attack (both
for prediction and decision-making), and the analysis of
unit movements. We provided the source code of the
extracting program (using BWAPI), which can be run on other
replays. We proposed and validated an encoding of army
composition which enables efficient situation assessment
and strategy adaptation. We believe it can benefit all the
current StarCraft AI approaches. Moreover, the probabilistic
nature of the model makes it deal natively with incomplete
information about the opponent's army.